Assignment Project Exam Help

https://eduassistpro.github.io/

Add WeChat edu\_assist\_pro

#### Review

- We would like to have the capacity of disk at the speed of the processor: unfortunately this is not feasible
- So we create a memory nierarchy: Exam Help
  - each successively lo https://eduassistpro.githubeid// data from next lower level
  - exploits <u>temporal locality</u>

    Add WeChat edu\_assist\_pro
  - do the common case fast, worry less about the exceptions (design principle of MIPS)
- Locality of reference is a Big Idea

#### Big Idea Review

- Mechanism for transparent movement of data among levels of a storage hierarchy
  - set of address/walue pindingsExam Help
  - address pro didates
  - compare dehttps://eduassistpro.github.io/
  - service hit or miss • load new block and binding

address: index offset tag 00000000000000000 000000001

| Index      | Valid | Tag | 0x0 - 0x3 | 0x4 - 0x7 | 0x8 - 0xb | Oxc - Oxf |
|------------|-------|-----|-----------|-----------|-----------|-----------|
| 0          |       |     |           |           |           |           |
| <b>→</b> 1 | 1     | 0   | а         | b         | С         | d         |
| 2          |       |     |           |           |           |           |
| 3          |       |     |           |           |           |           |

#### Outline

- Block Size Tradeoff
- Types of Cache Misses Assignment Project Exam Help
- Fully Associative Cac https://eduassistpro.github.io/
- N-Way Associative C
   Add WeChat edu\_assist\_pro
- Block Replacement Policy
- Multilevel Caches (if time)
- Cache write policy (if time)

### Block Size Tradeoff (1/3)

- Benefits of Larger Block Size
  - Spatial Locality: if we access a given word, we're likely to access other nearby words soon (Another Big Idea)
  - Very applicable with https://eduassistpro.gethubfiq/e execute a given instruction, it's likely that we'll execute assist pro assist pro
  - Works nicely in sequential array accesses too

### Block Size Tradeoff (2/3)

- Drawbacks of Larger Block Size
  - Larger block size means larger miss penalty
     Assignment Project Exam Help
     on a miss, takes longer time to load a new block from next level
  - If block size is too bi https://eduassistpro.gjthubnithere are too few blocks Add WeChat edu\_assist\_pro
    - Result: miss rate goes up
- In general, minimize
  - Average Access Time
    - = Hit Time + Miss Penalty x Miss Rate

### Block Size Tradeoff (3/3)

- Hit Time = time to find and retrieve data from current level cache
- Miss Penalty = avera data on a current level miss (includes the p https://eduassistpro.githubio/essive levels of memory hierarchy) Add WeChat edu\_assist\_pro
- Hit Rate = % of requests that are found in current level cache
- Miss Rate = 1 Hit Rate

Gill COMP 273

#### **Block Size Tradeoff Conclusions**



### Types of Cache Misses (1/2)

- Compulsory Misses
  - occur when a program is first started
     Assignment Project Exam Help
     cache does not cont 's day
  - cache does not cont 's data yet, so misses are bound to occur https://eduassistpro.github.io/
  - can't be avoided easily appropriate edu\_assists in this course

# Types of Cache Misses (2/2)

#### Conflict Misses

- miss that occurs because two distinct memory addresses map to the same cache location
- two blocks (which h https://eduassistpro.githebdoation) can keep overwriting each other Add WeChat edu\_assist\_pro
- big problem in direct-mapped caches
- how do we lessen the effect of these?

#### Dealing with Conflict Misses

Solution 1: Make the cache size bigger

relatively expensive

Assignment Project Exam Help

Solution 2: Multiple

it in the same Cache

Index?

https://eduassistpro.github.io/

Add WeChat edu\_assist\_pro

# Fully Associative Cache (1/3)

- Memory address fields:
  - Tag: same as before

Assignment Project Exam Help

- Offset: same as befo

- Index: non-existent https://eduassistpro.github.io/
- What does this mean WeChat edu\_assist\_pro
  - any block can go anywhere in the cache
  - must compare with all tags in entire cache to see if data is there

# Fully Associative Cache (2/3)

Fully Associative Cache (e.g., 32 B block)



# Fully Associative Cache (3/3)

- Benefit of Fully Assoc Cache
  - No Conflict Misses (since data can go anywhere)
     Assignment Project Exam Help
- Drawbacks of Fully
  - Need hardware com <a href="https://eduassistpro.github.io/">https://eduassistpro.github.io/</a>
     le entry:
    - If we have a 64KB of chedin washed edu\_assist, we heed 16K comparators: very expensive
- Small fully associative cache may be feasible

#### Third Type of Cache Miss

- Capacity Misses
  - miss that occurs because the cache has a limited size Assignment Project Exam Help
     miss that would not e size of the cache has a limited size of the cache has a lim
  - e size of the cache
  - sketchy definition, s https://eduassistpro.github.io/
- This is the primary typedown its the primary typedown is the primary typedow

# N-Way Set Associative Cache (1/4)

- Memory address fields:
  - Tag: same as before Assignment Project Exam Help

    - Offset: same as befo

  - Index: points us to t https://eduassistpro.githubeio/n this case)
- So what's the difference? We Chat edu\_assist\_pro
  - each set contains multiple blocks
  - once we've found correct set, must compare with all tags in that set to find our data

# N-Way Set Associative Cache (2/4)

#### • Summary:

- cache is direct-mapped with respect to sets
   Assignment Project Exam Help
   each set is fully asso
- If we have T blocks t https://eduassistpro.githwe.igh T/N directmapped cache, where at expercinal edu\_assist fully associative N block cache. Each has its own valid bit and data.

# N-Way Set Associative Cache (3/4)

- Given memory address:
  - Find correct set using Index value.
     Assignment Project Exam Help
     Compare Tag with al ermine
  - ermined set.
  - If a match occurs, it' https://eduassistpro.github.io/
  - Finally, use the offset Aid We Chat edu\_assist properties data within the desired block.

# N-Way Set Associative Cache (4/4)

- What's so great about this?
  - even a 2-way set associative cache avoids a lot of conflict misses
     Assignment Project Exam Help
     hardware cost isn't t
  - omparators
- https://eduassistpro.github.io/ In fact, for a cache
  - it's Direct-Mapped if Add WaChat edu\_assist\_(ArBlock per set)
  - it's Fully Associative if it's M-way set associative (M blocks per set)
  - so these two are just special cases of the more general set associative design

# Block Replacement Policy (1/2)

- N-Way Set Assoc (N a set, but block can occupy any position https://eduassistpro.githus.io/
- Fully Associative: blocked mesewedu\_assist\_any position (there is no index)
- Question: if we have the choice, where should we write an incoming block?

# Block Replacement Policy (2/2)

- Solution!
- If there are any locations with valid bit off (empty), then usually write the heighbook Projecthering Holie.
- If all possible locatio https://eduassistpro.galidbilock, we must use a replacement policy by which ine which block gets "cached out" on a miss. WeChat edu\_assist\_pro

#### Block Replacement Policy: LRU

- LRU (Least Recently Used)
  - Idea: cache out block which has been accessed (read or write) least recently

    Assignment Project Exam Help
  - Pro: temporal localit https://eduassistpro.gitbliesibkely future use: in fact, this is a very effective policy that edu\_assist\_pro
  - Con: with 2-way set assoc, easy to keep track (one LRU bit); with 4-way or greater, requires complicated hardware and much time to keep track of this

#### Block Replacement Example

• We have a 2-way set associative cache with a four word total capacity and one word blocks. We perform the following word accesses (ignore byt exam Help:

0, 2, 0, 1, 4, 0, 2, https://eduassistpro.github.io/

How many hits and how Warpat edu\_assigt there for the LRU block replacement policy?

#### Block Replacement Example: LRU loc 0 loc 1 • Addresses 0, 2, 0, 1, 4, 0, ... set 0 0: miss, bring into set 0 (loc 0) set 1 0 remainder 2 is 0, so set 0 set 0 Iru 2: miss, bring into set 0 (loc 1) set 1 2 remainder 2 is <sup>0</sup>Assignment Project Exam Help dru set 0 https://eduassistpro.github.ig/t 1 dru 🤈 1: miss, bring hto set at edu\_assist\_proet 0 Iru set 1 1 remainder 2 is 1, so set 1 set 0 Iru 4: miss, bring into set 0 (loc 1, replace 2) lru set 1 4 remainder 2 is 0, so set 0 ıru\_\_ set 0 0: hit Ilru set 1

#### Ways to reduce miss rate

- Larger cache
  - limited by cost and technology
  - hit time of first lever signmenty Project Exam Help
- More places in the ca <a href="https://eduassistpro.github.lb">https://eduassistpro.github.lb</a> associativity Add WeChat edu\_assist\_pro
  - fully-associative
    - any block any line
  - k-way set associated
    - k places for each block
    - direct map: k=1

#### Big Idea

- How do we chose between options of associativity, block size, replacement policy?
- Design against a per Assignment Project Exam Help
  - Minimize: Average Ahttps://eduassistpro.github.io/
  - = Hit Time + Miss Panaltwx Miss edu\_assist\_pro
     influenced by technology and pro vior

#### Example

- Assume
  - Hit Time = 1 cycle Assignment Project Exam Help
  - Miss rate = 5%
  - Miss penalty = 20 cy https://eduassistpro.github.io/
- Average memory access the chat edu\_assist\_poro = 2 cycle

### Improving Miss Penalty

When caches first became popular,
 Miss Penalty ~ 10 processor clock cycles

• Today: 1000 MHz Processor (1 ns per clock cycle) Angline togeto Example of the process



Solution: another cache between memory and the

processor cache: Second Level (L2) Cache

#### Analyzing Multi-level cache hierarchy



Add WeChat edu\_assist\_pro

Avg Mem Access Time = L1 Hit Time + L1 Miss Rate \* L1 Miss Penalty

L1 Miss Penalty = L2 Hit Time + L2 Miss Rate \* L2 Miss Penalty

Avg Mem Access Time = L1 Hit Time +
L1 Miss Rate \* (L2 Hit Time + L2 Miss Rate \* L2 Miss Penalty)

#### Typical Scale

- L1
  - size: tens of KB
  - hit time: complete in spieghard project Exam Help
  - miss rates: 1-5%
- L2 https://eduassistpro.github.io/
  - size: hundreds of KB Add WeChat edu\_assist\_pro
  - hit time: few clock cycles
  - miss rates: 10-20%
- L2 miss rate is fraction of L1 misses that also miss in L2
  - why so high?

#### Example: without L2 cache

- Assume
  - L1 Hit Time = 1 cycle
     Assignment Project Exam Help
  - -L1 Miss rate = 5%
  - L1 Miss Penalty = 10 https://eduassistpro.github.io/
- Average memory access the edu\_assist\_pgg
   = 6 cycles

#### Example with L2 cache

- Assume
  - L1 Hit Time = 1 cycle
  - L1 Miss rate = 5% Assignment Project Exam Help
  - L2 Hit Time = 5 cycles https://eduassistpro.github.io/
  - L2 Miss rate = 15% (%
  - L2 Miss Penalty = 100 AddsWeChat edu\_assist\_pro
- L1 miss penalty = 5 + 0.15 \* 100 = 20
- Average memory access time =  $1 + 0.05 \times 20$ = 2 cycle

3x faster with L2 cache

#### What to do on a write hit?

- Write-through
  - update the word in cache block and corresponding word in memory Assignment Project Exam Help
- Write-back
  - update word in cach https://eduassistpro.github.io/

  - allow memory word to be "stale"
    add WeChat edu\_assist\_pro
    add 'dirty' bit to each line indicati updated when block is replaced
  - OS flushes cache before I/O !!!
- Performance trade-offs?

### "And in conclusion..." (1/2)

- Caches are NOT mandatory:
  - Processor performs arithmetic
  - Memory stores data

    Assignment Project Exam Help
  - Caches simply make d https://eduassistpro.github.io/
- Each level of memory higher level
- Caches speed up due to temporal locality: store data used recently
- Block size > 1 word speeds up due to spatial locality: store words adjacent to the ones used recently

### "And in conclusion..." (2/2)

- Cache design choices:
  - size of cache: speed v. capacity
  - direct-mapped v. Assignment Project Exam Help
  - for N-way set assoc: <a href="https://eduassistpro.github.io/">https://eduassistpro.github.io/</a>
  - block replacement policy Add WeChat edu\_assist\_pro
  - 2nd level cache?
  - Write through v. write back?
- Use performance model to pick between choices, depending on programs, technology, budget, ...

#### A real example

And additional reading (for fun):

http://igoro.com/archive/gallery-of-processor-cache-effects/

Assignment Project Exam Help

https://eduassistpro.github.io/

Add WeChat edu\_assist\_pro

#### Assignment Project Exam Help

https://eduassistpro.github.io/

Add WeChat edu\_assist\_pro

```
average time loop 1 = 126858657.625
average time loop 2 = 71715726.125
ratio is 1.7689098957721807
but first loop does 32 times more work!!
Loop 1 gets work done 18 times faster!
```

```
A=[
                                        0 1 240591499
                                        1 2 134307003
                                        2 4 84736089
                                        3 8 74437939
                                        4 16 70215291
                                        5 32 73695400
                                        6 64 52077957
                                        7 128 19758427
                                        8 256 10488407
                                        9 512 6369311
                                        10 1024 3736937
                                        plot(A(:,1),A(:,3));
Assignment Project Example ("long in a cache block?");
                                        xlabel('exponent of size');
       https://eduassistpro.github.io/cache block?
                                                    2<sup>4</sup> = 16 words = 64 bytes
       Add WeChat edu_assist_pro
                                    time
                                      0.5
                                               2
                                                       exponent of size
```



#### Review and More Information

• Sections 5.3 - 5.4 of textbook

Assignment Project Exam Help

https://eduassistpro.github.io/

Add WeChat edu\_assist\_pro